ABSTRACT. Based on the heuristics that maintaining presumptions can be
beneficial in uncertain environments, we propose a set of basic requirements
for learning systems to incorporate the concept of prejudice. The simplest,
memoryless model of a deterministic learning rule obeying the axioms is
constructed, and shown to be equivalent to the logistic map. The system's
performance is analysed in an environment in which it is subject to external
randomness, weighing learning defectiveness against stability gained. The
corresponding random dynamical system with inhomogeneous, additive noise is
studied, and shown to exhibit the phenomena of noise induced stability and
stochastic bifurcations. The overall results allow for the interpretation that
prejudice in uncertain environments can entail a considerable portion of
stubbornness as a secondary phenomenon.



1. Introduction 

Like almost all terms denoting affects, the term 'prejudice' is as ubiquitous as
it is ill-defined, and resists naive attempts to provide it with meaning in any
epistemologically rigorous sense. Yet the last century saw, with the advent of
game theory as a means of formalisation and modelling, the paving of a
scientific access path to such notions, partially as a side effect of the
growing interest in the behaviour of intelligent beings (usually called agents)
in social environments [1], which in turn was spurred by the mathematisation of
economics. With the emergence of the Internet as a mass medium, this kind of
research has obtained a new test bed, a source of statistical data, and an
independent study object, and has thus gained further impetus [2, 3, 4].
Consequently, hard science is led to occupy itself with formerly foreign
concepts from the psychological and sociological domains. Recent efforts in
this direction combine game theory and logic with nonlinear systems theory and
stochastics, and some of them are quite bold, as for instance [5]. The present
study parallels these lines of thought, and is, to the best of our knowledge,
the first one to focus on the concept of prejudice. It is based on the report
[6], but is essentially enlarged and refined.

Section 2 provides the theoretical framework on which the subsequent
construction of a model for prejudice rests. We set up a simple, general model
for an environment in which an agent needs to learn a fluctuating risk, and
argue that prejudice might be beneficial in such circumstances.
The special instance of a prejudiced learning rule we consider is constructed
in Section 3. In particular, we define what we mean by a prejudiced
deterministic learning rule, in the language of reinforcement learning. We then
set out three axioms we regard as heuristically sound for prejudiced learning
in environments polluted with noise, and look for the simplest prejudiced
learning rule fulfilling them, a rule which in particular does not require the
agents to have memory. In the noiseless case, the ensuing rule turns out to be
equivalent to the logistic map. In correspondence to the well known dynamical
features of this map we classify the general behaviour of prejudiced learners,
and obtain restrictions on their internal parameters. When the input to the
prejudiced learning rule is subject to random fluctuations, it still retains the
form of the logistic map, but with additively coupled, inhomogeneous noise.

To corroborate the argument that the limitation of rationality presented by
prejudiced learning can be beneficial in certain circumstances, a numerical
analysis of the performance of a subclass of prejudiced learners in an
uncertain environment is carried out in Section 4 for various levels of noise.
The result is twofold. On the one hand, the prejudice causes the agents to make
a small error in their belief about the true risk, generally overestimating it.
On the other hand, prejudiced learning can efficiently stabilise an agent's
behaviour, in particular for higher values of the logistic map's single
dynamical parameter. Therefore, if both factors are taken into account,
prejudiced learning has the potential to be advantageous in noisy environments.

Our special prejudiced learning rule with noise is an example of a noisy
dynamical system, a class of systems which has attracted a lot of interest from
physicists and mathematicians in recent years. Reference [7] is an early,
seminal work. Noisy, or random, dynamical systems show a host of additional
phenomenology over ordinary ones, including phenomena that are to be
expected from natural systems. Furthermore, they present a combination of
nonlinearity and stochastics on which a whole branch of mathematics now
thrives, see [8] and its vast bibliography. The system defined by the prejudiced
learning rule is therefore interesting in its own right, and we devote Section 5
to its study. In it, we rediscover the phenomenon of stochastic resonance or
noise induced stability [9, 10] for a large range of parameter values and noise
levels, as well as the phenomenon of stochastic bifurcations or noise induced
transitions [11, 12] as these parameters vary. Both phenomena have been
empirically confirmed in a vast variety of natural systems and models thereof,
ranging from biophysics [13] and chemistry [14], over financial markets [15]
and signal processing [16], to quantum information theory [17], without any
claim to completeness. We determine the stability domain of our prejudiced
learning system analytically using the Lyapunov exponent, and study the
bifurcating transition from that domain, which is evoked by lowering the noise
level, using numerics. In particular, we determine the critical exponent of this
transition, in analogy to concepts of statistical mechanics.

Finally, Section 6 contains a comprehensive assessment of our model for
prejudiced learning. It concludes with some suggestions for further research to
be based on this and similar models.

2. Background and Heuristics 

To place our work into theoretical context and delineate its scope, we briefly
recapitulate the necessary background. The framework for the construction of
our model for prejudiced learning is that of basic game theory and
reinforcement learning [18, 19].
The classical, game-theoretical subject of decision making under uncertainty
considers single- or multi-player games of intelligent agents against an
environment (nature). Each agent has a utility function depending on the action
it takes and the (unknown) state of nature. This function describes the payoff
of the execution of the corresponding action given that nature is in the
corresponding state. The utility function is taken to be the input for the
agent's decision rule, which uniquely determines the action to be taken. In the
more realistic case when a probability distribution over the states of nature is
known to the agent, one is in the realm of statistical decision theory. There,
this knowledge is fed into a selection scheme which determines one from a set of
decision rules accordingly. For example, the well known Bayesian decision rule
is the combination of a selection scheme and a decision rule which assigns to
each action the average sum of utilities weighted with the known probability
distribution, and then chooses the action maximising this value. The knowledge
can be a priori or learnt by experimentation using some statistical learning
rule, which can be as simple as taking means over a finite set of experimental
outcomes (Bayesian learning). When the result of taking a specific action is
fed back as the outcome of an experiment into the learning rule and the whole
cycle is repeated many times, the agent becomes a learning automaton. This
is the basic object we consider.

The Bayesian rule is obviously not the only possible one; there is a multitude
of different learning rules and selection schemes. Numerous statistical learning
rules have been considered in pursuit of the optimal one in a given
environment, see, e.g., [20, 21], and references therein. In realistic cases it
can be sensible to choose selection schemes which differ significantly from the
apparently most rational Bayesian rule. Here, we propose a model which can
justifiably be said to represent agents who are prejudiced by construction and
in behaviour. We derive the heuristics for our construction from a key example:

Example. At each time step, the agent takes an action, selected from a finite
set $A$, with a certain payoff whose maximal value we normalise to 1, for
simplicity. Choosing action $k$, there is a (generically small) probability
$p_k$ for the occurrence of a damage $d_k$ which diminishes the maximal payoff
to $1 - d_k$. Assume that the actual damage is symmetrically distributed with
small variance around the mean value $d_k \in [0, 1]$. The expected payoff then
is the (true empirical) weight $w_k = 1 - r_k = 1 - d_k \cdot p_k$ of action
$k$, and $r_k$ is its true risk. The weights are the quantities rational
decision makers would base their decisions on. Approximations for them are
learnt by the agents from the frequency and amount of previous damages they
actually incurred. These are fed into a selection scheme which in turn
determines the decision rule for action selection.

The general feature of this example rendering the application of Bayes' rule
less attractive is that the damage rate $p_k$, and therefore the learning speed
of any statistical inference rule for the set $\{r_k\}$, can be very low
compared to the frequency with which an action has to be taken. Thus, initial
probabilistic fluctuations could result in the (costly) selection of a
non-optimal action for many steps. On the other hand, the expected damage $d_k$
can be close to 1, resulting in relatively high risk. Thus, the agent has a high
interest in using a reliable a priori risk estimate, to keep it stable against
fluctuations, and still learn as quickly as possible.
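
To get a feeling for how slow such learning can be, the following minimal
sketch computes the relative standard error of a plain frequency-based estimate
of the risk $r_k = d_k \cdot p_k$ after $n$ steps. It assumes, purely for
illustration, independent damage events with probability `p` per step and a
fixed damage per event; the function names are hypothetical and not taken from
the text.

```python
import math


def risk_relative_std_error(p: float, n: int) -> float:
    """Relative standard error of the sample-mean estimate of r = d * p.

    Assumption: n independent steps, damage probability p per step, and a
    fixed damage d per event (d cancels in the relative error).
    """
    return math.sqrt((1.0 - p) / (p * n))


def steps_for_relative_error(p: float, target: float) -> int:
    """Approximate number of steps needed to reach the target relative error."""
    return math.ceil((1.0 - p) / (p * target ** 2))
```

Under these assumptions, a damage probability of one percent already requires
on the order of ten thousand steps before the estimate is accurate to about ten
percent, which illustrates why a stable a priori estimate is attractive.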

Numerous studies in algorithmic learning theory are concerned with optimal
learning performance in adverse environments polluted with noise, see [22]
and references therein. There, the strategies of the learner, and of its
adversaries creating the noise, are rather elaborate. Our present approach
differs from these by focusing on constructive simplicity of the model and its
behavioural features rather than optimality. Furthermore, our adversary will be
a single, very simple, noise model.

FIGURE 1. Prejudiced learning in the context of decision making under
uncertainty. Shaded rectangles are internal states of the agent, unshaded
ones stand for actual events. Framed rectangles are algorithms.

3. Model Building 

Given the conflicting goals above, it can be sensible to use what we would
like to call a prejudiced learning system for the weights $w_k$, by which we
mean the following. The agent is given a start value $w_k(0)$ for its belief
about the value of $w_k$. At time $t$ it infers its (empirical) knowledge
$\eta_k(t)$ from the $t+1$ observations made up to this time using some
statistical learning rule not further specified, with the single requirement
that $\eta_k(t)$ asymptotically approaches the real value $w_k$ as
$t \to \infty$. The agent then updates its belief $w_k(t)$ about $w_k$ at
time $t + 1$ by a prejudiced learning rule

$$w_k(t+1) = L\bigl((w_k(s))_{s \le t}, \eta_k(t)\bigr), \qquad t = 0, 1, \ldots$$

The essence of this constructive definition of prejudice is the heuristically
sound distinction between knowledge and belief, close in spirit to [23].

Figure 1 shows a schematic view of the internal structure of the prejudiced
agent and places it into the general context of decision making under
uncertainty. The scope of our model below is merely the transition from
knowledge to belief through a learning rule of the above kind, i.e., the three
components in the lower right quadrant of the diagram. We will assume in
particular that the reaction consists merely in informing the agent of its
error, i.e., the numerical difference between its belief and the actual state of
the world. Furthermore, since we consider the parameters $w_k$, and the agent's
beliefs about them, to be independent of and unrelated to each other, we focus
on the learning of a single exterior parameter, and thus drop the index $k$ from
now on.

Before proceeding, a caveat might be in order with respect to the terminology
we introduced. The Webster's dictionary definition of the term "prejudice"
is a "preconceived opinion, usually unfavourable; the holding of such an
opinion; an unjustified and unreasonable bias". But then, "prejudiced learning"
seems self-contradictory from the outset, and is certainly a controversial
expression. We nevertheless stick to this slight provocation, though the bare
learning rule above and the three axioms set out below would seem to only
justify more cautious terms like "prudence". We shall see that at least one very
simple model for "prejudiced learning" will show characteristics, like staying
away from the true value of the parameter to be learnt, usually attributed to
prejudice proper.

3.1. Axioms. The construction of the special model of a prejudiced learning
rule we propose is based on three basic axioms and guided by the principle of
simplicity. The latter means in particular that we consider only memoryless
prejudiced learning, i.e., $L = L(w(t), \eta(t))$. Furthermore, we seek to
render $L$ in the simplest (functional) form possible. The axioms for the model
are derived from the heuristic meaning of three special cases of learning
situations, cases which can be viewed as constituting ancillary conditions under
which a prejudiced rule has to function. In particular, b) and c) are meant to
justify the attribution "learning rule", while a) exhibits an extreme case of
"prejudice":

a) Inability is preserved: The value $w(0) = 0$ is taken to express the initial
inability of the agent to perform the action. This must remain constant, i.e.,
$w(0) = 0$ implies $L(0, \eta(t)) = 0$.

b) Importance spurs learning: The higher an agent ranks an action,
i.e., the higher $w(t)$, the faster it shall adapt its belief to its knowledge.
That is, $|L(w(t), \eta(t)) - w(t)|$ is monotonically increasing in $w(t)$ for
all $\eta(t)$.

c) Truth is preserved: If the agent's belief equals its knowledge, then it is
kept constant, i.e., $L(\eta(t), \eta(t)) = \eta(t)$.

These axioms imply a fundamental asymmetry between high and low risk,
which will reemerge in the behavioural patterns of any model fulfilling them,
as will be seen in the particular case below. Heuristically, the axioms
incorporate a certain cautiousness, in that they tend to preserve a belief of
high risk (low weight). Regarding axiom a), note that we formulated it in the
least restrictive way, since it does not exclude a value $w(t) = 0$ at later
times. The first axiom also entails a straightforward, implicit assumption on
the decision rule, namely that actions with weight zero are not taken.

3.2. Noiseless Case: The Logistic Learning Map. The combination of the
axioms with the definition of prejudiced learning is certainly satisfied by
numerous functional forms of learning rules. We now concentrate on a single
special model which, because of its simplicity, allows us to derive some
phenomenological consequences which seem to be of general importance and might
pertain also to more complex variants of prejudiced learning rules. It should
however be noted that any conclusion drawn from this special model holds,
properly speaking, only for this model and not for general prejudiced learning
rules, even if they satisfy the axioms above.

For the construction of a prejudiced learning rule satisfying the above
axioms, we specialise to the noiseless case, i.e., we assume the input
$\eta(t) = w = 1 - r$ to be identical to the true weight for all times.
Expressed in the believed risk $r(t)$ and its error $\Delta(t) = r - r(t)$, the
simplest prejudiced learning rule is linear in either of these variables:

$$\Delta(t+1) = a \cdot r(t) \cdot \Delta(t),$$

with a parameter $a > 0$. Re-expressed in the original variables it reads

$$r(t+1) = r - a \cdot r(t) \cdot (r - r(t)) \quad\text{or}\quad
w(t+1) = (1 - a) \cdot w + a \cdot w(t) \cdot (1 - w(t) + w).$$

This rule reflects the cautiousness inherent in the axioms, in tending to
maintain a prejudice of high risk (low weight). It satisfies conditions b) and
c) and can be augmented by a simple condition on $a$ to also satisfy a), see
Section 3.4 below. Two extremal cases for the parameter $a$ are $a = 0$, in
which case the agent immediately learns the input value $r$, i.e., no prejudice
is present, and $a = 1/r(0)$, prohibiting any learning.
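
The properties claimed for this rule can be checked numerically with a minimal
sketch. The function names `learn` and `relative_step` are hypothetical.
Axiom b) is read here as a statement about the fraction of the gap between
belief and knowledge that is closed in one step, which is an interpretation and
not a definition taken from the text.

```python
def learn(w_t: float, eta: float, a: float) -> float:
    """One step of w(t+1) = (1 - a) * eta + a * w(t) * (1 - w(t) + eta)."""
    return (1.0 - a) * eta + a * w_t * (1.0 - w_t + eta)


def relative_step(w_t: float, eta: float, a: float) -> float:
    """Fraction of the gap eta - w(t) closed in one step.

    Algebraically this equals 1 - a + a * w(t), so it grows with w(t).
    Requires eta != w_t.
    """
    return (learn(w_t, eta, a) - w_t) / (eta - w_t)
```

For instance, `learn(eta, eta, a)` returns `eta` for any `a` (truth is
preserved), `learn(0.0, eta, 1.0)` returns `0.0` (inability is preserved only
for `a = 1`), and with `a = 1 / r(0)` the first step leaves the belief unchanged.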

Introducing the relative error $\delta(t) = \Delta(t)/r$, the learning map
reduces to

$$\delta(t+1) = f_\rho(\delta(t)) = \rho \cdot \delta(t)\,(1 - \delta(t)),
\quad\text{with } \rho = ar.$$

This is the well known logistic map [24, Section 7-4], whose characteristics are
entirely determined by the bifurcation parameter $\rho$. It is usually
considered as a self-mapping of the unit interval, but note that the natural
domain for $\delta$ is here $[(r - 1)/r, 1]$.
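
The equivalence can be made concrete by iterating the rule in the believed risk
$r(t)$ and the logistic map in the relative error $\delta(t)$ side by side. The
following is a small sketch with hypothetical function names; `r_true` stands
for the true risk $r$.

```python
def risk_step(r_t: float, r_true: float, a: float) -> float:
    """Noiseless rule r(t+1) = r - a * r(t) * (r - r(t))."""
    return r_true - a * r_t * (r_true - r_t)


def logistic(delta: float, rho: float) -> float:
    """Logistic map delta -> rho * delta * (1 - delta)."""
    return rho * delta * (1.0 - delta)


def largest_mismatch(r0: float, r_true: float, a: float, steps: int = 20) -> float:
    """Iterate both forms and return the largest difference in delta."""
    rho = a * r_true
    r_t = r0
    delta = (r_true - r0) / r_true
    worst = 0.0
    for _ in range(steps):
        r_t = risk_step(r_t, r_true, a)
        delta = logistic(delta, rho)
        worst = max(worst, abs((r_true - r_t) / r_true - delta))
    return worst
```

Up to floating-point rounding, the mismatch vanishes, since
$\delta(t+1) = a \cdot r(t) \cdot \Delta(t)/r$ and $r(t) = r\,(1 - \delta(t))$.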

3.3. Classes of Behaviour. We assume the intrinsic parameters $(a, r(0))$ to
be invariable characteristics of a given agent. Then, in the noiseless case, the
behaviour of an agent is determined by these intrinsic parameters, and the
environmental parameter $r$ via $\rho = ar$. We tentatively distinguish between
three classes of agents, and denote them heuristically as follows.

Adaptive (A): For $\rho < 1$, the logistic map has 0 as its single, attractive
fixed point. These agents are therefore bound to adapt to the external input $r$
at an exponential rate.

Stubborn (S): In the domain $1 < \rho < 3$, the fixed point 0 becomes unstable,
and the unique stable fixed point is $\delta^* = (\rho - 1)/\rho$, which is
again approached at an exponential rate. Agents of this class are bound to
persistently underestimate the true risk to a certain degree.

Uncertain (U): For $3 < \rho < 4$, the logistic map exhibits a bifurcating
transition to deterministic chaos. These agents exhibit an increasingly erratic
behaviour as $\rho$ rises, which we subsume under the label 'uncertainty'.

For $\rho > 4$, the logistic map is no longer a self-mapping of the unit
interval. We simply ignore this case.
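
The classification can be sketched in a few lines. The function names are
hypothetical, and the assignment of the boundary values $\rho = 1$ and
$\rho = 3$ to the upper class is an arbitrary choice made only for this sketch,
since the text leaves them unassigned.

```python
def logistic(delta: float, rho: float) -> float:
    """Logistic map delta -> rho * delta * (1 - delta)."""
    return rho * delta * (1.0 - delta)


def classify(rho: float) -> str:
    """Return 'A', 'S' or 'U' for the dynamical parameter rho = a * r."""
    if not 0.0 < rho <= 4.0:
        raise ValueError("rho outside (0, 4] is not classified")
    if rho < 1.0:
        return "A"  # adaptive: delta -> 0
    if rho < 3.0:
        return "S"  # stubborn: delta -> (rho - 1) / rho
    return "U"      # uncertain: cycles and chaos


def long_run_delta(rho: float, delta0: float = 0.3, steps: int = 2000) -> float:
    """Relative error after many noiseless steps, started inside (0, 1)."""
    delta = delta0
    for _ in range(steps):
        delta = logistic(delta, rho)
    return delta
```

For an adaptive agent the relative error decays to zero, while for a stubborn
agent with, say, $\rho = 2.5$ it settles at $\delta^* = 0.6$, i.e., the agent
keeps believing in only 40 percent of the true risk.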

We will see in Section 3.4 that behaviours of the last two classes S and U
can, under reasonable assumptions, only occur if the agent initially
underestimates the risk, $r(0) < r$. On the other hand, an initial value
$r(0) > r$ will lead to agents of class A, corroborating the heuristics that
cautiousness, that is, overestimation of the risk, entails a rather safe
behaviour.

3.4. Viability Conditions. In many respects, the logistic learning map is too
simple to work properly. In particular, it does not satisfy axiom a), but this
can be accomplished by adapting the intrinsic parameters $(a, r(0))$: we demand
that the value $r(0) = 1$ implies $a = 1$. This condition, which we impose from
now on, is a paradigm for what we call a viability condition. These are
conditions on the intrinsic parameters that guarantee a proper functioning of
the agents or improve their performance in a given environment. A biological
heuristic for their prevalence is that agents not fulfilling them are naturally
deselected.

We use two other viability conditions. First, we restrict the range of
admissible $a$-values for all agent classes to $a < 1/r(0)$, to make the
learning map contractive at least in the first time step, and to omit the
extremal case of non-learning $a = 1/r(0)$ mentioned above. Though we could
restrict the range of $a$ further to avoid the occurrence of agents with
$\rho > 4$, we refrain from posing the pertinent intrinsic condition $a < 4$,
since this demand seems too restrictive for all classes. As said above, we
ignore agents with resulting $\rho > 4$.

The last condition concerns only agents of class A and is somewhat more
severe in its consequences: For these agents, the fixed point $\delta^*$ is
negative and moreover repulsive. If it lies within the range $[1 - 1/r, 1]$ of
admissible $\delta$-values, then the learning map would diverge to
$\delta \to -\infty$ whenever $\delta(t)$ becomes $< \delta^*$. Although this
could be avoided in the noiseless case by admitting only start values
$\delta(0) > \delta^*$, it would almost surely happen in noisy environments,
e.g., the additive noise model of the following subsection. To prevent this
disastrous effect we require $a < 1$ for A-agents as the third viability
condition. This leads to $\delta^* < 1 - 1/r$, pushing the repulsive fixed point
out of the range of $\delta$.
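
The effect of this condition can be illustrated with a short sketch. The
function names are hypothetical, and the numerical values in the usage note are
arbitrary examples, not parameters from the text.

```python
def logistic(delta: float, rho: float) -> float:
    """Logistic map delta -> rho * delta * (1 - delta)."""
    return rho * delta * (1.0 - delta)


def repulsive_fixed_point(a: float, r_true: float) -> float:
    """delta* = (rho - 1) / rho with rho = a * r; negative for A-agents."""
    rho = a * r_true
    return (rho - 1.0) / rho


def domain_lower_bound(r_true: float) -> float:
    """Lower end 1 - 1/r of the admissible range of delta."""
    return 1.0 - 1.0 / r_true


def diverges(delta0: float, rho: float, steps: int = 500, bound: float = 1e6) -> bool:
    """True if the noiseless iteration leaves [-bound, bound] within `steps`."""
    delta = delta0
    for _ in range(steps):
        delta = logistic(delta, rho)
        if abs(delta) > bound:
            return True
    return False
```

With $r = 0.5$ and $a = 0.8$ the repulsive fixed point lies at $-1.5$, below the
lower bound $-1$ of the domain. With $a = 1.5$ it moves to $-1/3$, inside the
domain, and an iteration started below it runs away.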

The viability conditions already have direct consequences for S- and U-agents.
Due to the condition $a < 1/r(0)$, the case $\rho > 1$ can only occur if
$r/r(0) > \rho > 1$, i.e., if the agent initially underestimates the risk.
Furthermore, the higher values of $\rho$, and therefore the more complex
behaviour patterns, emerge with increasing discrepancy between initial belief
$r(0)$ and knowledge $\eta(t) = r$. We have given a heuristic interpretation of
these features in Section 3.3.

Although it would be desirable to dispense with the viability conditions by
refining the learning rule, this does not seem easy at the given level of
simplicity, without giving away other desired features. For instance, consider
the straightforward attempt to adapt the parameter $a$ with time to keep it
$< 1/r(t)$, e.g., using the additional rule $a(t+1) = a(t)\,r(t)/r(t+1)$, and
thus force the learning map to be contractive at all times. This would certainly
enable us to lift the viability condition $a < 1$ for A-agents. Yet, apart from
necessitating at least a one-step memory for the prejudiced learner, it would
also let the modified agents become subject to fluctuations in $r(t)$, something
the heuristics for the construction of prejudiced learning rules suggests to
avoid in the first place. We will propose a measure of the pertinent quality in
Section 4 and see that the logistic learning map amended by the viability
conditions performs well with respect to it.

As opposed to these linear boundary conditions in parameter space, we will
employ boundary conditions in real space in the next subsection, when
submitting the system to a noisy environment.

We can now obtain a very coarse picture of the relative proportions in which
an observer would expect the three behavioural classes to occur in a population
of prejudiced learners. We calculate the a priori probabilities for an agent to
belong to one of the classes, assuming that $(r, r(0), \rho)$ are uniformly
distributed in the domain $(0, 1)^2 \times \{0 < \rho < r/r(0)\}$, ignoring the
last viability condition that affects only A-agents. Integrating over the
pertinent ranges and discarding agents with a resulting $\rho > 4$ yields the
ratios 4/5, 8/45, and 1/45 for class A, S, and U, respectively. Thus, this naive
estimation renders adaptive behaviour prevalent, while stubborn and uncertain
behaviour occur with small but non-negligible probability.
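
These ratios can be reproduced with a short calculation. The sketch below
assumes one particular reading of the uniformity assumption, namely that
$(r, r(0))$ is uniform on the unit square and that, given these, $\rho$ is
uniform on $(0, r/r(0))$. Under this reading, $P(\rho < c) = 1 - 1/(4c)$ for
$c \ge 1$. The reading and all names are assumptions of the sketch, not
statements from the text.

```python
import random
from fractions import Fraction


def prob_rho_below(c: int) -> Fraction:
    """P(rho < c) for an integer c >= 1 under the assumed reading."""
    return 1 - Fraction(1, 4 * c)


def class_ratios() -> tuple:
    """Exact ratios for classes A, S, U after discarding rho > 4."""
    kept = prob_rho_below(4)
    p_a = prob_rho_below(1) / kept
    p_s = (prob_rho_below(3) - prob_rho_below(1)) / kept
    p_u = (prob_rho_below(4) - prob_rho_below(3)) / kept
    return p_a, p_s, p_u


def monte_carlo_ratios(samples: int = 200_000, seed: int = 0) -> dict:
    """Sampling check of the same ratios."""
    rng = random.Random(seed)
    counts = {"A": 0, "S": 0, "U": 0}
    for _ in range(samples):
        r_true, r0 = rng.random(), rng.random()
        if r0 == 0.0:
            continue  # avoid division by zero
        rho = rng.random() * r_true / r0
        if rho > 4.0:
            continue  # agents with rho > 4 are discarded
        counts["A" if rho < 1.0 else "S" if rho < 3.0 else "U"] += 1
    kept = sum(counts.values())
    return {name: count / kept for name, count in counts.items()}
```

The exact calculation returns the fractions 4/5, 8/45 and 1/45, and the sampling
check agrees with them within statistical error.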

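3.5. Adding Noise. We now assume that the input $\eta(t)$ of the prejudiced
learning rule is subject to additive, statistical fluctuations around the true
risk value $r$, i.e.,

$$\eta(t) = r + \Xi(t),$$

with a random variable $\Xi$. As the simplest possible noise model we choose
$\Xi$ to be symmetrically distributed with spread $\Sigma > 0$ around 0. That
is, $\Xi$ is i.i.d. in $[-\Sigma, \Sigma]$. To keep $\eta$ in $[0, 1]$, this
limits the range of admissible values for $\Sigma$ to
$0 < \Sigma < \min(r, 1 - r)$. Using relative coordinates $\delta(t)$ and
$\xi(t) = \Xi(t) \cdot \rho/r$, we can separate the fluctuations from the
learning map:

$$\delta(t+1) = f_{\rho,\xi}(\delta(t))
= f_\rho(\delta(t)) + \frac{\xi(t)}{\rho} \cdot \bigl(\rho\,(1 - \delta(t)) - 1\bigr)
= f_\rho(\delta(t)) - \xi(t)\,(\delta(t) - \delta^*).$$

Since $\xi$ is symmetrically distributed around 0, the sign of the noise term is
immaterial, and we write it as $+\xi(t)\,(\delta(t) - \delta^*)$ from now on.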
FIGURE 2. Average error of A-agents. Numbers at top/bottom curves denote
minimum/maximum values of $\Sigma$, between which the remaining curves
interpolate in equidistant steps.



This is an example of a random dynamical system (RDS). In this special case, it
is a dynamical system with inhomogeneous noise, for which the inhomogeneity
$\delta(t) - \delta^*$ vanishes at the fixed point. While systems with
homogeneous, additive noise have been intensively studied, the inhomogeneous
case is scarce in the literature. The additive separation of the noise from the
dynamical mapping means, in particular, that we will still be able to make use
of the tentative classification of Section 3.3 for the noiseless case, to
classify, at least partially, the behaviour of the agents under noise.
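
The separation of the noise from the learning map can be verified numerically.
The sketch below applies the learning rule to a noisy input in the risk variable
and compares the result with the relative-coordinate form. The function names
are hypothetical. In this exact form the noise term enters as
$\xi\,(\delta^* - \delta)$, which differs from $\xi\,(\delta - \delta^*)$ only
by the sign of the symmetric variable $\xi$.

```python
def risk_step(r_t: float, eta: float, a: float) -> float:
    """Learning rule applied to a (possibly noisy) input eta."""
    return eta - a * r_t * (eta - r_t)


def delta_step(delta: float, xi: float, rho: float) -> float:
    """Logistic map plus inhomogeneous noise term xi * (delta* - delta)."""
    delta_star = (rho - 1.0) / rho
    return rho * delta * (1.0 - delta) + xi * (delta_star - delta)


def mismatch(r_t: float, r_true: float, a: float, noise: float) -> float:
    """Difference between the two descriptions of one noisy step."""
    rho = a * r_true
    r_next = risk_step(r_t, r_true + noise, a)
    delta = (r_true - r_t) / r_true
    xi = noise * rho / r_true
    return abs((r_true - r_next) / r_true - delta_step(delta, xi, rho))
```

For example, `mismatch(0.3, 0.4, 0.8, 0.1)` is zero up to rounding: both
descriptions give a relative error of $-0.13$ after one step.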

The change of variable from $\Xi$ to $\xi$ renders $\xi$ an i.i.d. random
variable with range $[-\sigma/2, \sigma/2]$, where
$\sigma = 2\rho/r \cdot \Sigma$. Therefore, the general bounds $a < 1/r(0)$ and
$\Sigma < \min(r, 1 - r)$ imply $\sigma < 2/r(0) \cdot \min(r, 1 - r)$ in
general, and $\sigma < 2\rho$ in dependence of $\rho$. On the other hand, the
noise level $\sigma$ is limited to $\sigma < 2 \min(r, 1 - r)$ for A-agents by
the viability condition $a < 1$.

Since the noise is in general non-vanishing at both boundaries of the domain
$[(r - 1)/r, 1]$ of $\delta$, the functions $f_{\rho,\xi}$ are in general not
self-mappings of this domain, and thus need to be augmented by boundary
conditions. For the performance analysis of A-agents in the next section we use
boundary conditions of von Neumann type

$$f_{\rho,\xi}((r - 1)/r) = (r - 1)/r, \qquad f_{\rho,\xi}(1) = 1,$$

for all $\rho < 1$ and all $\xi$. That means that when the system hits the
boundary, it remains there for one step and has, due to noise, a probability
$> 1/2$ to leave it in the next step.
FIGURE 3. Average volatility of A-agents.



4. Performance under Noise 

Our prime heuristic for the introduction of the limitation of rationality
presented by prejudice into learning was that agents can benefit from not
following stochastic fluctuations, in addition to minimising the errors they
make. Therefore, the performance of prejudiced agents in a noisy environment
should be assessed by considering at least the following two natural quantities.
The first one is simply the average error $\Delta$, which is a measure for the
deterioration of learning performance due to prejudice. However, we expect the
learners to benefit from prejudice by capitalising on a reduction of the average
volatility $v = |r(t+1) - r(t)|$ of their belief, a variable which might for
example be associated with an energy cost, perhaps arising from an energetic
price the agents would have to pay for changing their selected action. We use
these two variables to analyse the performance of adaptive agents by numerical
simulations.

The benign neglect of S- and U-agents is motivated by their fundamentally
different behaviour that hampers a quantitative comparison with A-agents. In
particular, stubborn agents have a constant error $\Delta > 0$ and volatility 0.
This persists even when they are subjected to noise, as will be seen, together
with further qualitative features of the other agent classes, in Section 6.

As a further restriction, following the heuristics that statistical learning is
slow in comparison to the frequency of prejudiced learning and decision making,
we consider the limiting case in which the fluctuation level $\Sigma$ is
constant, and thus ignore statistical learning altogether. Generically, we would
suppose it to exhibit a slow decay, depending on the efficiency of the
statistical learning rule and the characteristics of the environment.
Simulations took place in the parameter ranges $0 < r < 1$, $0 < a < 1$,
restricted by the viability conditions of Section 3.4, and
$0 < \Sigma < \min(r, 1 - r)$, determined by the noise model of Section 3.5. The
results for error and volatility are shown in Figures 2 and 3, respectively, in
which every data point represents an average over $2.5 \times 10^6$ time steps
in 25 independent runs with random starting values $r(0)$.
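
A much smaller version of such a simulation can be sketched as follows. This is
not the code used for the figures. It assumes uniformly distributed noise on
$[-\Sigma, \Sigma]$, reads the boundary conditions as clipping the believed risk
to $[0, 1]$, and uses a single short run with a burn-in phase. All names and
default values are hypothetical.

```python
import random


def simulate(r_true: float, a: float, spread: float,
             steps: int = 200_000, burn_in: int = 1_000, seed: int = 0):
    """Return (average error, average volatility) of one noisy run.

    Error is r - r(t); volatility is |r(t+1) - r(t)|. Noise is assumed
    uniform on [-spread, spread].
    """
    rng = random.Random(seed)
    r_t = rng.random()  # random starting value r(0)
    err_sum = 0.0
    vol_sum = 0.0
    for t in range(steps + burn_in):
        eta = r_true + rng.uniform(-spread, spread)
        r_next = eta - a * r_t * (eta - r_t)
        r_next = min(1.0, max(0.0, r_next))  # keep the belief in [0, 1]
        if t >= burn_in:
            err_sum += r_true - r_next
            vol_sum += abs(r_next - r_t)
        r_t = r_next
    return err_sum / steps, vol_sum / steps
```

Calling, e.g., `simulate(0.5, 0.8, 0.3)` and `simulate(0.5, 0.0, 0.3)` allows a
direct comparison of a prejudiced agent with an agent that simply follows its
noisy input.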

It is apparent from Figure 2 that the error induced by noise is always
negative, i.e., leads the agents to overestimate the risk. Furthermore, the
error is always small, hardly ever reaching 10 percent, and tends to be somewhat
smaller above $r = 0.5$ than below. Thus, the heuristics on cautiousness
inherent in the construction of the model is confirmed in this experimental
setting.

On the other hand, as the last two rows in Figure 3 show, a stabilising
mechanism of prejudiced learning becomes effective with increasing $a$, as $v$
decreases from its unadulterated value $\frac{2}{3}\Sigma$ at $a = 0$. This is
particularly true for higher $r$ entailing a higher dynamical parameter $\rho$.
In fact, as the graph for $r = 0.9$ exhibits, the learners become completely
stable when $\rho$ approaches 1, even in the presence of noise. This remarkable
feature will be studied further in Section 5.

Altogether, prejudice in learning has the potential to improve the performance
of agents by reducing their volatility significantly, while not putting
them far out in their risk estimation, or impeding the efficacy of learning too
much. In particular, the learning rate, or rather the rate at which a stable
equilibrium is approached, remains exponential.

5. A Noisy Dynamical System 

5.1. Noise Induced Stability of Prejudiced Learning. In Section 4 we
have seen that the prejudiced learning map becomes increasingly stable as
$\rho$ approaches 1, despite the presence of noise. In fact, the inhomogeneity
of the noise, vanishing at the fixed point $\delta^*$, leads us to the suspicion
that this point plays a special role for the dynamics of the map at higher
$\rho$. In this section we want to pursue this trail further, and consider the
features of the map for $\rho$ between 1 and 4, the upper limit at which the
noiseless logistic map reaches full deterministic chaos.

Here and in the subsequent numerical analysis we employ periodic boundary
conditions, i.e., we consider the RDS $f_{\rho,\sigma}$ defined by

$$x_{t+1} = f_{\rho,\xi_t}(x_t) = \rho\,x_t\,(1 - x_t) + \xi_t\,(x_t - x^*) \mod 1,$$

with $\xi$ an i.i.d. random variable in $[-\sigma/2, \sigma/2]$, and
$x^* = (\rho - 1)/\rho$. These boundary conditions make the presentation and the
numerics somewhat simpler, while it turns out that they do not affect the
behaviour of the system significantly.
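
A direct transcription of this system into code might look as follows. It is a
sketch under the assumption of uniformly distributed noise; the function name
and the seeding scheme are hypothetical.

```python
import random


def rds_orbit(rho: float, sigma: float, x0: float, steps: int, seed: int = 0) -> list:
    """Orbit of x -> rho * x * (1 - x) + xi * (x - x*) mod 1.

    xi is drawn uniformly from [-sigma / 2, sigma / 2] in every step.
    """
    rng = random.Random(seed)
    x_star = (rho - 1.0) / rho
    x = x0
    orbit = [x]
    for _ in range(steps):
        xi = rng.uniform(-sigma / 2.0, sigma / 2.0)
        # Python's % with a positive modulus folds the value back into
        # the unit interval, which implements the periodic boundary.
        x = (rho * x * (1.0 - x) + xi * (x - x_star)) % 1.0
        orbit.append(x)
    return orbit
```

With `sigma = 0.0` the function reproduces the noiseless logistic map, and an
orbit started exactly at $x^*$ feels no noise at all, because the noise term is
proportional to $x_t - x^*$.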

Simulations of this system show that, for many combinations of the dynamical
parameter $\rho$ and a nonzero noise level $\sigma$, it rapidly approaches, and
then stably remains at, the fixed point $x^*$. In fact, this holds for all
admissible noise levels $\sigma < 2\rho$ in the domain $1 < \rho < 3$
corresponding to agents of class S. Yet the effect prevails even for a continuum
of agents of class U, that is, in the parameter range $3 < \rho < 4$, where the
noiseless system undergoes period-doubling bifurcations into $2^n$-cycles on the
route to deterministic chaos. An example of this stochastic resonance is shown
in Figure 4, which also shows how bifurcating behaviour is restored through a
stochastic bifurcation when the noise level is lowered to leave the domain of
the resonance (the stability domain). These two phenomena are examined in the
following subsections.
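
Why noise can stabilise the fixed point may be seen from a standard
linearisation argument, sketched below. The derivative of the noisy map at $x^*$
is $2 - \rho + \xi_t$, so small deviations shrink on average if the mean of
$\ln|2 - \rho + \xi|$ is negative. The sketch assumes uniformly distributed
$\xi$ and evaluates this mean in closed form; whether it coincides with the
analytic criterion used in the subsequent analysis is not checked here, and the
function names are hypothetical.

```python
import math


def _log_abs_antiderivative(u: float) -> float:
    """Antiderivative u * ln|u| - u of ln|u|, extended continuously to u = 0."""
    return 0.0 if u == 0.0 else u * math.log(abs(u)) - u


def lyapunov_at_fixed_point(rho: float, sigma: float) -> float:
    """Mean of ln|2 - rho + xi| for xi uniform on [-sigma / 2, sigma / 2]."""
    c = 2.0 - rho
    if sigma == 0.0:
        return math.log(abs(c)) if c != 0.0 else -math.inf
    lo, hi = c - sigma / 2.0, c + sigma / 2.0
    return (_log_abs_antiderivative(hi) - _log_abs_antiderivative(lo)) / sigma
```

For $\rho = 3.08$ this exponent is positive at small noise levels and becomes
negative when the noise level is raised sufficiently, in line with the
qualitative picture of a stability domain that is left when the noise is
lowered.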
FIGURE 4. Examples of evolutions at $\rho = 3.08$.



5.2. Basics of Noisy Dynamics. Let us first introduce the necessary
prerequisites on noisy, respectively, random dynamical systems. We rely on [25]
as a primary source, and develop the material using the logistic learning map
above as an example. Note that more refined theoretical tools for the treatment
of such systems abound by now, see for instance [26], but are not needed in the
analysis of the simple system at hand.

Every RDS, including the present one, is completely characterised by its
transition density $\mathcal{P}_{\rho,\sigma}(x, y)$, which yields the
probability under $f_{\rho,\sigma}$ for ending up in an interval
$J \subset [0, 1]$ upon starting in $I \subset [0, 1]$, by the formula

$$\Pr\nolimits_{\rho,\sigma}(I \to J) = \int_I \int_J \mathcal{P}_{\rho,\sigma}(x, y)\,dy\,dx.$$

Figure 5 shows an example of this density. Explicitly we find

$$\mathcal{P}_{\rho,\sigma}(x, y) = (\sigma\,|x - x^*|)^{-1} \cdot
\chi_{[x - \frac{\sigma}{2}|x - x^*|,\; x + \frac{\sigma}{2}|x - x^*|]}(y),$$

where $\chi$ is the characteristic function of an interval, and periodic
boundary conditions are implicitly assumed on all variables. The transition
density serves to define the Perron-Frobenius (P-F) operator of an RDS, a
central tool in the system's analysis [7]. This operator acts on functions
$u \in L^1([0, 1])$, i.e., probability densities, by

$$PF(u)(y) = \int_0^1 \mathcal{P}_{\rho,\sigma}(f_\rho(x), y)\,u(x)\,dx.$$

FIGURE 5. $V_{3.08,1.2}$ (left) and associated noise alone (right).

The eigenvalues and eigenvectors of PF are of key importance. In particular, the
positive and normalised eigenvectors to the highest eigenvalue 1, i.e.,
probability densities $u$ with $\mathrm{PF}(u) = u$, are called invariant
densities of the system. An invariant density $u$ defines an associated
invariant measure $\mu_u$ by $\mu_u(A) = \int_A u(x)\,dx$, where $A$ is any
Lebesgue-measurable set. An important example of an invariant measure for the
present system is $\mu_{\delta_{x^*}}$, generated by the point measure
$\delta_{x^*}$ at $x^*$, as an easy calculation shows, using the fact that
$V_{\rho,\sigma}(x_n, \cdot)$ is a $\delta$-sequence if $x_n \to x^*$.

Yet more interest lies in the so-called physically significant or
Bowen-Ruelle-Sinai (BRS) measures. A BRS-measure $\mu_{\mathrm{BRS}}$ is defined
for an ordinary, i.e., non-random dynamical system, defined by a deterministic
mapping $f$, by the following property. There exists a subset $U$ of the
configuration space considered and with positive Lebesgue measure, such that for
every continuous function $\varphi$, the following holds

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{j=0}^{N-1} \varphi(f^j(x)) = \int \varphi \, d\mu_{\mathrm{BRS}} \]

for all starting points $x \in U$, where $f^j$ denotes the $j$th iterate of $f$,
see [27]. By the Birkhoff individual ergodic theorem, this property is always
fulfilled for $\mu_{\mathrm{BRS}}$-almost all $x$. The crucial strengthening of
the hypothesis lies in the assumption that the ergodic hypothesis can be safely
applied for all starting points $x$ in a set of positive Lebesgue measure. For
an RDS, the above property must be formulated in the mean with respect to the
stochastic perturbation, i.e., the noise. Since most physically interesting
quantities of a dynamical system are time averages, an ergodic hypothesis is
regularly invoked by physicists to calculate them by space averages. This makes
the existence and uniqueness of BRS-measures an important theoretical issue in
the study of random and ordinary dynamical systems.
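
The defining property of a BRS-measure can be illustrated numerically in the simplest setting. The sketch below uses the noiseless logistic map at the illustrative value $\rho = 2.5$, where every starting point in $(0, 1)$ converges to $x^* = 0.6$, so that time averages of a continuous function $\varphi$ should approach $\varphi(x^*)$, the integral of $\varphi$ against the point mass at $x^*$. The helper names `logistic` and `time_average` are assumptions made for this example only.

```python
def logistic(rho):
    """Return the noiseless logistic map x -> rho * x * (1 - x)."""
    return lambda x: rho * x * (1.0 - x)


def time_average(phi, f, x0, n_steps):
    """Finite-N time average (1/N) * sum_{j=0}^{N-1} phi(f^j(x0))."""
    total, x = 0.0, x0
    for _ in range(n_steps):
        total += phi(x)
        x = f(x)
    return total / n_steps


if __name__ == "__main__":
    f = logistic(2.5)
    phi = lambda x: x * x
    for x0 in (0.1, 0.5, 0.9):
        # each value should be close to phi(0.6) = 0.36
        print(x0, time_average(phi, f, x0, 10_000))
```

The finite sum only approximates the limit $N \to \infty$; the transient before the orbit settles near $x^*$ contributes an error of order $1/N$.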

5.3. Stable Phase and Lyapunov Exponent. We are now in a position to bolster the
heuristic conjecture of noise induced stability in the parameter range
$3 < \rho < 4$ with theoretical and numerical evidence. Furthermore, we want to
determine the shape of the latter stable phase, i.e., the set of
$(\rho, \sigma)$ values for which the point mass concentrated at $x^*$ generates
the unique BRS-measure, that is $\mu_{\mathrm{BRS}} = \mu_{\delta_{x^*}}$.

FIGURE 6. Stable Lyapunov exponent $\lambda_S$.

The Lyapunov exponent $\lambda$ is central to the study of dynamical systems. It
measures the exponential rate at which distant starting points converge to an
attractor under the system's evolution, respectively, the rate at which nearby
starting points become separated. In the former case, $\lambda$ is negative, and
in the latter positive, while zeros of $\lambda$ mark transitions in the system's
behaviour. Lyapunov exponents can be used to detect stability, bifurcations, and
the onset of chaos, and one of the simplest examples for their use is, again,
the logistic map [24, Section 7-4]. It has been questioned whether Lyapunov
exponents play the same role for RDS, since there are examples in which a
positive $\lambda$ does not indicate ordinary chaotic behaviour [28, 29]. Yet, a
negative Lyapunov exponent always corresponds to stability, and therefore we
chose to tentatively characterise the stable phase of our system through the
property $\lambda < 0$. The quantity itself is defined by the time average

\[ \lambda = \lim_{N\to\infty} \frac{1}{N} \sum_{t=0}^{N-1} \ln|f'_{\rho,\xi_t}(x_t)|. \]

If a BRS-measure is known, $\lambda$ can be calculated, using an ergodic theorem,
as a space average

\[ \lambda = \int\!\!\int \ln|f'_{\rho,\xi}(x)| \, d\mu_{\mathrm{BRS}}(x) \, dP(\xi), \]

taking into account that in the noisy case we also have to average over the
random variable $\xi$, whose distribution is denoted by $P$.
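
The time-average definition translates directly into a numerical estimate. The following sketch is an illustration under stated assumptions, not the procedure behind the paper's figures: the function name `lyapunov_time_average` and its default arguments are hypothetical, the noise is assumed uniform on $[-\sigma/2, \sigma/2]$, and the derivative used is $f'_{\rho,\xi}(x) = \rho(1 - 2x) + \xi$, obtained by differentiating the map before the reduction mod 1.

```python
import math
import random


def lyapunov_time_average(rho, sigma, n_steps=100_000, n_skip=1_000,
                          x0=0.3, seed=0):
    """Estimate lambda as (1/N) * sum_t ln|f'_{rho,xi_t}(x_t)|.

    The first n_skip iterations are discarded as a transient.
    Note: math.log raises ValueError if the derivative is exactly zero,
    an event of probability zero for continuous noise.
    """
    rng = random.Random(seed)
    x_star = (rho - 1.0) / rho
    x, total = x0, 0.0
    for t in range(n_skip + n_steps):
        xi = rng.uniform(-sigma / 2.0, sigma / 2.0)
        if t >= n_skip:
            total += math.log(abs(rho * (1.0 - 2.0 * x) + xi))
        x = (rho * x * (1.0 - x) + xi * (x - x_star)) % 1.0
    return total / n_steps
```

In the noiseless case with $1 < \rho < 3$ the estimate reduces to $\ln|2 - \rho|$, which gives a simple sanity check before noise is switched on.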

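The stable Lyapunov exponent $\lambda_S$ is now calculated under the assumption
$\mu_{\mathrm{BRS}} = \mu_{\delta_{x^*}}$. For the given noise model we obtain

\[
\begin{aligned}
\lambda_S(\rho,\sigma)
  &= \frac{1}{\sigma} \int_{-\sigma/2}^{\sigma/2} \int_0^1 \ln|f'_{\rho,\xi}(x)| \, d\mu_{\mathrm{BRS}}(x) \, d\xi \\
  &= \frac{1}{\sigma} \int_{-\sigma/2}^{\sigma/2} \ln|f'_{\rho,\xi}(x^*)| \, d\xi \\
  &= \frac{1}{\sigma} \int_{-\sigma/2}^{\sigma/2} \ln|2 - \rho + \xi| \, d\xi \\
  &= \frac{1}{\sigma} \int_{-\sigma/2}^{\sigma/2} \ln(\rho - 2 + \xi) \, d\xi
\end{aligned}
\]

for $\rho - 2 > \sigma/2$ (the last step uses the symmetry of the noise interval
under $\xi \to -\xi$), and by a change of variable $\zeta = \rho - 2 + \xi$ we
finally find, using the abbreviations $\Lambda_\pm = \rho - 2 \pm \sigma/2$,

\[ \lambda_S = \frac{1}{\sigma} \int_{\Lambda_-}^{\Lambda_+} \ln\zeta \, d\zeta
   = \frac{1}{\sigma} \Big[ (\ln\zeta - 1)\,\zeta \Big]_{\Lambda_-}^{\Lambda_+}. \]

FIGURE 7. Numerical evaluation of $\lambda(\rho, \sigma)$. On a regular grid with
resolution (0.0025, 0.05), each point represents 25 independent runs of length
$10^6$, after omitting $10^4$ initial iterations.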

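Similar calculations in the two other cases $\rho - 2 < \sigma/2$ and
$\rho - 2 = \sigma/2$ yield the net result

\[
\lambda_S(\rho,\sigma) = \frac{1}{\sigma}
\begin{cases}
(\ln\Lambda_+ - 1)\Lambda_+ - (\ln\Lambda_- - 1)\Lambda_-, & \text{if } \Lambda_- > 0; \\
(\ln\sigma - 1)\,\sigma, & \text{if } \Lambda_- = 0; \\
(\ln\Lambda_+ - 1)\Lambda_+ + (\ln|\Lambda_-| - 1)|\Lambda_-|, & \text{if } \Lambda_- < 0.
\end{cases}
\tag{$*$}
\]

Figure 6 displays this result. The solid curve in both pictures is the nodeline
$\lambda_S = 0$ at which the hypothesis $\mu_{\mathrm{BRS}} = \mu_{\delta_{x^*}}$
breaks down, while the dotted line in the right picture is
$\rho - 2 = \sigma/2$, where the two solutions in ($*$) connect. The conjectured
stability domain is marked by S.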
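
The closed form can be cross-checked against direct numerical integration. The sketch below assumes that ($*$) is the average of $\ln|\zeta|$ over $[\Lambda_-, \Lambda_+]$ with $\Lambda_\pm = \rho - 2 \pm \sigma/2$, and uses the odd antiderivative $G(\zeta) = \zeta(\ln|\zeta| - 1)$, which covers all three cases at once. The names `lambda_s` and `lambda_s_quadrature` are hypothetical.

```python
import math


def _G(z):
    """Antiderivative z * (ln|z| - 1) of ln|z|, extended by G(0) = 0."""
    return 0.0 if z == 0.0 else z * (math.log(abs(z)) - 1.0)


def lambda_s(rho, sigma):
    """Closed form (*): (G(Lambda_+) - G(Lambda_-)) / sigma."""
    lam_plus = rho - 2.0 + sigma / 2.0
    lam_minus = rho - 2.0 - sigma / 2.0
    return (_G(lam_plus) - _G(lam_minus)) / sigma


def lambda_s_quadrature(rho, sigma, n=200_000):
    """Midpoint-rule value of (1/sigma) * int ln|2 - rho + xi| d xi.

    Accurate when Lambda_- > 0; for Lambda_- < 0 the integrand has a
    logarithmic singularity and the midpoint rule converges slowly.
    """
    h = sigma / n
    total = 0.0
    for k in range(n):
        xi = -sigma / 2.0 + (k + 0.5) * h
        total += math.log(abs(2.0 - rho + xi))
    return total * h / sigma
```

Because $G$ is odd, $-G(\Lambda_-) = (\ln|\Lambda_-| - 1)|\Lambda_-|$ for $\Lambda_- < 0$, which reproduces the sign change between the first and third case.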

We confirm the existence of a stable phase empirically by collecting numerical
data for the Lyapunov exponent using time averages, obtaining the picture shown
in Figure 7. We find that $\lambda$ is identical to $\lambda_S$ within the
statistical error bounds for the most part of the region S. Yet for higher
$\sigma > 3$, the empirical values of $\lambda$ are significantly larger than
$\lambda_S$ along the inner boundary of S. It seems plausible that the strongly
intermittent behaviour of the system in that region, of which Figure 4 shows an
example, together with the periodic boundary conditions, in effect prevents the
convergence of $\lambda$ to $\lambda_S$.

5.4. Stochastic Bifurcation. The sample evolutions in Figure 4 exhibit the
stochastic bifurcation the system undergoes when the noise level is lowered to
leave the stable phase. We now examine this noise induced transition more closely
at $\rho = 3.08$. It occurs at the transition point $\sigma_0(\rho)$, which is
obtained by numerically solving for the lower solution of
$\lambda_S(\rho, \sigma_0(\rho)) = 0$ in equation ($*$), yielding
$\sigma_0(3.08) \approx 1.3683$. As $\sigma$ approaches 0, the system converges
to the deterministic 2-cycle with attractor
$\{p_\pm = \frac{1}{2\rho}(\rho + 1 \pm \sqrt{(\rho + 1)(\rho - 3)})\}$.
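
Both quantities in this paragraph can be recomputed with a few lines of code. The sketch assumes the closed form of $\lambda_S$ as the average of $\ln|\zeta|$ over $[\rho - 2 - \sigma/2, \rho - 2 + \sigma/2]$; bisection is used here purely for illustration, since the text does not say which root finder was employed. The names `transition_point` and `two_cycle` and the default bracket are hypothetical.

```python
import math


def lambda_s(rho, sigma):
    """Stable Lyapunov exponent, written with G(z) = z * (ln|z| - 1)."""
    def G(z):
        return 0.0 if z == 0.0 else z * (math.log(abs(z)) - 1.0)
    c = rho - 2.0
    return (G(c + sigma / 2.0) - G(c - sigma / 2.0)) / sigma


def transition_point(rho, lo=1e-3, hi=2.0, tol=1e-10):
    """Bisection for the lower zero of sigma -> lambda_s(rho, sigma).

    Assumes lambda_s > 0 at lo and lambda_s < 0 at hi, which holds for
    the default bracket at rho = 3.08.
    """
    if not (lambda_s(rho, lo) > 0.0 > lambda_s(rho, hi)):
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lambda_s(rho, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def two_cycle(rho):
    """Points (p_-, p_+) of the noiseless 2-cycle, defined for rho > 3."""
    root = math.sqrt((rho + 1.0) * (rho - 3.0))
    return (rho + 1.0 - root) / (2.0 * rho), (rho + 1.0 + root) / (2.0 * rho)
```

Calling `transition_point(3.08)` should reproduce the quoted value of $\sigma_0$ up to rounding, and the two points returned by `two_cycle(3.08)` are mapped onto each other by the noiseless logistic map.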

Figure 8 shows the development of the invariant densities for the BRS measures
at $\rho = 3.08$ as $\sigma$ varies. The data for a given $\sigma$ represents the
distribution of $M$ i.i.d. starting points after evolving $N$ time steps. This
statistical strategy can be justified if we assume that the
Perron-Frobenius-Ruelle theorem [25] holds. It asserts, among others, that the
iterates $\mathrm{PF}^N(u)$ of any density $u$ of a normalised measure which is
absolutely continuous w.r.t. the BRS measure converge to the density of the BRS
measure as $N \to \infty$. However, just above the transition point (marked by
the solid vertical line) the BRS measure $\mu_{\delta_{x^*}}$ is not well
approximated with $N = 1000$, cf. Figure 4. Below $\sigma_0$, the singularity of
the invariant density disappears, and it develops two maxima around
$\sigma \approx 1$, which become separated at $\sigma \approx 0.75$. Overall, the
transition is a rather typical example for a stochastic bifurcation
[8, Chapter 9].
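
The sampling strategy described above, evolving $M$ independent starting points for $N$ steps and histogramming the result, can be sketched as follows. This is an illustration only: the names `evolve` and `density_histogram`, the default sample sizes and the bin count are assumptions, starting points are drawn uniformly from $[0, 1)$, and the noise is assumed uniform on $[-\sigma/2, \sigma/2]$.

```python
import random


def evolve(x, rho, sigma, n_steps, rng):
    """Apply the noisy map n_steps times to the starting point x."""
    x_star = (rho - 1.0) / rho
    for _ in range(n_steps):
        xi = rng.uniform(-sigma / 2.0, sigma / 2.0)
        x = (rho * x * (1.0 - x) + xi * (x - x_star)) % 1.0
    return x


def density_histogram(rho, sigma, m_points=2000, n_steps=1000,
                      bins=50, seed=0):
    """Histogram estimate of the density after n_steps iterations.

    The returned list is normalised so that it integrates to 1 on [0, 1].
    """
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(m_points):
        x = evolve(rng.random(), rho, sigma, n_steps, rng)
        counts[min(int(x * bins), bins - 1)] += 1
    return [c * bins / m_points for c in counts]
```

In the stable phase the histogram is expected to collapse onto the single bin containing $x^*$, while below the transition point it should spread out; how quickly this happens near $\sigma_0$ depends on $N$, as noted in the text.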

Recently, it has been conjectured that stochastic bifurcations are in close
correspondence to phase transitions in systems of statistical mechanics [30].
Although the theoretical basis for such a claim is presently not firm, the two
classes of phenomena exhibit many common features. One of them is symmetry
breaking, which is also present in the stochastic bifurcations of our system. It
can be exhibited by looking at the evolution of the return map during the
bifurcation. This map is defined by the sequence of the local expansion rates

\[ y_t = \ln|f'_{\rho,\xi_t}(x_t)|, \]

and the statistical distribution of pairs $(y_t, y_{t+1})$ is its transition
density, for which some examples are shown in Figure 9. The symmetry with respect
to reflections on the diagonal that is present in the stable phase is broken by
the stochastic bifurcation.

FIGURE 9. Transition densities of the return map at $\rho = 3.08$, approximated
on a regular $200 \times 200$ partition of $[-2, 2]^2$ using $10 \times 10^6$
iterations, omitting $10^4$ initial steps. Cells with absolute probabilities
below $10^{-6}$ appear white.
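
A rough sketch of how such a transition density can be accumulated is given below. It is not the code behind Figure 9: the function names are hypothetical, the noise is assumed uniform on $[-\sigma/2, \sigma/2]$, and the `asymmetry` index is an ad hoc diagnostic introduced here only to quantify deviations from symmetry under reflection on the diagonal.

```python
import math
import random


def expansion_rates(rho, sigma, n_steps, n_skip=10_000, x0=0.3, seed=0):
    """Sequence y_t = ln|f'_{rho,xi_t}(x_t)| with f' = rho*(1 - 2x) + xi."""
    rng = random.Random(seed)
    x_star = (rho - 1.0) / rho
    x, ys = x0, []
    for t in range(n_skip + n_steps):
        xi = rng.uniform(-sigma / 2.0, sigma / 2.0)
        if t >= n_skip:
            ys.append(math.log(abs(rho * (1.0 - 2.0 * x) + xi)))
        x = (rho * x * (1.0 - x) + xi * (x - x_star)) % 1.0
    return ys


def return_map_histogram(ys, cells=200, lo=-2.0, hi=2.0):
    """Relative frequencies of pairs (y_t, y_{t+1}) on a cells x cells grid.

    Pairs with a component outside [lo, hi) are ignored.
    """
    hist = [[0.0] * cells for _ in range(cells)]
    width = (hi - lo) / cells
    pairs = list(zip(ys, ys[1:]))
    for a, b in pairs:
        if lo <= a < hi and lo <= b < hi:
            i = min(int((a - lo) / width), cells - 1)
            j = min(int((b - lo) / width), cells - 1)
            hist[i][j] += 1.0 / len(pairs)
    return hist


def asymmetry(hist):
    """Sum of |H[i][j] - H[j][i]| over i < j (zero for a symmetric H)."""
    n = len(hist)
    return sum(abs(hist[i][j] - hist[j][i])
               for i in range(n) for j in range(i + 1, n))
```

With finite samples the index is never exactly zero in the noisy stable phase, so it should be compared across noise levels rather than read as an absolute test.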

5.5. Critical Exponent. Finally, we want to further stress the analogy between
stochastic bifurcations and critical phenomena in statistical mechanics [31]. A
characteristic of the latter is the vanishing or divergence of certain intensive
observable quantities, called order parameters, at the critical point. Their
values near the critical point are governed by universal scaling laws, which are
quantitatively described by the so-called critical exponents.

FIGURE 10. Left: Recurrence time statistics at $\tau = 0$; $N = 5 \times 10^3$,
$m = 5 \times 10^4$ (+); $N = 10^5$, $m = 5 \times 10^4$ (o);
$N = 5 \times 10^6$, $m = 10^3$ (□). Right: Correction for $N = 10^5$,
$m = 5 \times 10^4$; $\tau \approx 0.00291$ (□); $\tau = 0$ (o);
$\tilde{T}_P$ (+) and linear fit.

As a very simple example demonstrating the general scheme, consider the first
bifurcation of the noiseless logistic map at $\rho_c = 3$. The only relevant
order parameter is the Lyapunov exponent, which vanishes at $\rho_c$ with a
critical exponent

\[ \gamma_\pm \overset{\mathrm{def}}{=} \lim_{\tau \to 0\pm} \frac{\ln|\lambda(\tau)|}{\ln|\tau|} = 1, \]

determined as the exponent of the leading power in the expansion of $\lambda$ in
the scale-free parameter $\tau = (\rho - \rho_c)/\rho_c$.
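
As a hedged check of the value 1 for this exponent, the Lyapunov exponent of the noiseless map can be expanded on both sides of the bifurcation at 3. The sketch relies on two textbook facts rather than on a computation taken from the source: just below the bifurcation the attractor is the fixed point $x^*$ with multiplier $2 - \rho$, and just above it the attractor is the 2-cycle, whose two points are denoted $p_\pm$ here and whose multiplier is $4 + 2\rho - \rho^2$. Writing $\rho = 3(1 + \tau)$:

```latex
\begin{aligned}
\tau < 0:\quad
  \lambda &= \ln|f_\rho'(x^*)| = \ln|2 - \rho| = \ln(1 + 3\tau)
           = 3\tau + O(\tau^2), \\
\tau > 0:\quad
  \lambda &= \tfrac{1}{2} \ln|f_\rho'(p_+)\, f_\rho'(p_-)|
           = \tfrac{1}{2} \ln|4 + 2\rho - \rho^2|
           = \tfrac{1}{2} \ln(1 - 12\tau - 9\tau^2)
           = -6\tau + O(\tau^2).
\end{aligned}
```

In both cases $|\lambda|$ is proportional to $|\tau|$ to leading order, so $\ln|\lambda| / \ln|\tau| \to 1$ from either side.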

In our RDS, there are a number of other order parameters apart from $\lambda$
that can be considered, and we use the Poincaré dimension of the critical point
$x^*$, which is a very natural measure for the collapse of the invariant density
to $\delta_{x^*}$ at the critical point. It is determined by the recurrence time
statistics as follows, cf. [32]. The Poincaré recurrence time, which is the
average time after which the system returns to a small domain
$B_r(x^*) = \{x \mid |x - x^*| < r\}$ of radius $r$ around $x^*$, is given by

\[ T_P(r) \overset{\mathrm{def}}{=} \lim_{N \to \infty} \frac{N}{\#\{x_i \in B_r(x^*),\ 1 \le i \le N\}}. \]

Obviously, this is not a scale-free quantity, so to obtain the desired order
parameter, one assumes that $T_P$ behaves asymptotically as

\[ T_P(r) \propto r^{-D_P} \qquad (r \to 0). \]

The number $D_P$ so defined is the Poincaré dimension of $x^*$. As the system
enters the stable phase, with scale-free parameter
$\tau = (\sigma_0(\rho) - \sigma)/\sigma_0(\rho)$ tending to 0 from above,
$T_P(r)$ approaches 1 for all $r$, and we expect $D_P$ to vanish.
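
For a finite sample the recurrence time and the resulting dimension can be estimated as sketched below. The function names are hypothetical, the limit over the sample length is replaced by the finite sample itself, and the dimension is obtained from an ordinary least-squares fit of the logarithm of the recurrence time against the logarithm of the radius, which is one simple choice among several.

```python
import math


def recurrence_time(xs, x_star, r):
    """Finite-N estimate N / #{x_i in B_r(x*)}.

    Returns math.inf if the ball is never visited.
    """
    hits = sum(1 for x in xs if abs(x - x_star) < r)
    return len(xs) / hits if hits else math.inf


def poincare_dimension(xs, x_star, radii):
    """Minus the least-squares slope of ln T_P(r) against ln r.

    Assumes every ball B_r(x*) with r in radii is visited at least once.
    """
    pts = [(math.log(r), math.log(recurrence_time(xs, x_star, r)))
           for r in radii]
    n = len(pts)
    mean_x = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts)
    sxx = sum((p[0] - mean_x) ** 2 for p in pts)
    return -sxy / sxx
```

Two limiting cases help to calibrate the estimator: points spread evenly over the unit interval give a dimension close to 1, whereas a sample sitting entirely at the centre of the ball gives a recurrence time of exactly 1 for every radius.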

In the following, we examine the transition at $\rho = 3.08$. Figure 10 shows
various examples for recurrence time statistics gathered by numerical
simulations with $m$ independent runs of length $N$, always omitting $10^4$
initial steps. In the left picture we see three graphs for $T_P$ at the critical
point $\tau = 0$ for various $N$. This exhibits the problem that, due to the
combined effects of slow convergence of the system as such, and the additional
numerical error which becomes relevant near the critical point, the measured
Poincaré dimension does not seem to converge to zero as $N \to \infty$ (and does
not depend on the sample size for $m > 10^4$). To circumvent this difficulty we
tentatively replace $T_P$ at given $\tau$ and $N$ with 'corrected' values
$\tilde{T}_P$ gained by division by $T_P$ at $\tau = 0$ and the same sampling
time $N$, as shown in the right picture. A comparison of the 'corrected'
Poincaré dimensions $\tilde{D}_P$ obtained from the $\tilde{T}_P$ for different
values of $N$, see Figure 11, shows that these quantities do not depend strongly
on $N$, lending some justification to this approach.

FIGURE 11. Poincaré dimensions and log-periodic fit at $\rho = 3.08$.

Remarkably it turns out that the behaviour of the 'corrected' Poincaré dimension
$\tilde{D}_P$ near $\tau = 0$ is not governed by a simple power law. Recently it
has been observed in a number of areas that singularities in many natural
phenomena exhibit log-periodic oscillations corresponding to complex critical
exponents, see [33, Section 3] and references therein. In such cases, one
generally expects the considered observable to behave asymptotically like
$\operatorname{Re} \tau^{\beta + i\omega} = \tau^\beta \cos(\omega \ln \tau)$.
Therefore, we fit the Ansatz

\[ \tilde{D}_P(\tau) \propto A + B\tau^\beta + C\tau^\beta \cos(\omega \ln \tau + \theta) \qquad (\tau \to 0) \]

to the dataset for $\tilde{D}_P$. This yields the continuous line in Figure 11 as
the best fit graph, and the corresponding value of the critical exponent is

\[ \beta = 1.18 \pm 0.05. \]

Notice that the modification of the pure power law is rather small, as is
expressed by the ratio $B/C \approx 3.3/0.25 = 13.2$. Similar analysis for
$\rho = 3.12$ confirms that $\beta$ is within the given error bounds.
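
The log-periodic Ansatz and the identity behind it, namely that the real part of a power with complex exponent is a power law modulated by a cosine in the logarithm of its argument, can be written down and checked numerically. The function names below are hypothetical. The text quotes only the critical exponent 1.18 and the amplitudes 3.3 and 0.25; any values used for the offset, the log-frequency and the phase, such as the frequency 5.0 in the demonstration, are placeholders and not fitted results.

```python
import math


def log_periodic(tau, A, B, C, beta, omega, theta):
    """Ansatz A + B*tau^beta + C*tau^beta*cos(omega*ln(tau) + theta), tau > 0."""
    power = tau ** beta
    return A + B * power + C * power * math.cos(omega * math.log(tau) + theta)


def real_part_complex_power(tau, beta, omega):
    """Re tau^(beta + i*omega), evaluated with Python's complex power."""
    return (tau ** complex(beta, omega)).real


if __name__ == "__main__":
    beta, omega = 1.18, 5.0  # omega is an arbitrary placeholder value
    for tau in (0.001, 0.01, 0.3):
        lhs = real_part_complex_power(tau, beta, omega)
        rhs = tau ** beta * math.cos(omega * math.log(tau))
        print(tau, lhs, rhs)  # the two columns should agree to rounding
```

An actual fit would pass `log_periodic` to a nonlinear least-squares routine; that step is omitted here because the text does not state which fitting procedure was used.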

6. Conclusions 

Reiterating that general conclusions about prejudiced learning rules cannot, 
properly speaking, be derived by considering a single instance as we did above, 
we still want to note some of the indications which the detailed study of this 
special case provides us with. 

The model for prejudiced learning and behaviour presented above is rather plain
by its construction, which was focused on a few fundamental aspects we deemed
characteristic. Its utter simplicity, although of conceptual beauty, appears as
a drawback when its performance is critically assessed. For instance, it made
the introduction of viability conditions necessary to ensure the desired
functionality of the model, see Section 3.4. These conditions are unsavoury in
rendering the three different classes of prejudiced learners incomparable. Yet,
they are at least intrinsic conditions that can be satisfied by excluding
certain values of the model's parameters from consideration. In sum, the model
itself calls for refinement to be applicable, and performant, in more realistic
situations. Nevertheless, the basic performance results gathered in Section 4
support the heuristics of model building inasmuch as the class of adaptive
agents learns at exponential speed, and can benefit from prejudice in a noisy
environment by reducing their volatility significantly.

The most important phenomenological aspect of our model is doubtlessly the
possible emergence of stubbornness as a secondary phenomenon of prejudice. In
fact, for $\rho > 1$, it becomes the prevalent behaviour in noisy environments,
aided by the mechanism of noise induced stability, a mechanism which is rather
generic for nonlinear systems driven by noise and thus could well apply to other
prejudiced learning rules than the logistic one. It must be emphasised that
stubbornness cannot be ruled out as not being performant and therefore rare in
realistic ensembles of prejudiced learners. The only condition for stubbornness
to appear likely in individual agents is that the risk is initially
underestimated (by the pertinent viability condition) and high, since
$\rho = \alpha r$. Stubbornness thus is a high risk phenomenon. One scenario
particularly catches the imagination. Assume for the moment that we have removed
the awkward viability condition $\alpha < 1$ for adaptive agents, e.g., by
replacing it with suitable boundary conditions. Then, agents of class A can be
pushed into classes S or U by a sudden rise of risk, and consequently become
stubborn.

The expectation to find the logistic prejudiced learning rule as such realised
in learning systems and environments as complex as human beings and human
societies is doubtful. Yet it might still be a candidate model for the
description of certain behavioural aspects of biological systems from the level
of single cells to that of plants and lower animals. The thorough adherence to
the principle of simplicity in the construction of the model, realised in the
rather reduced game-theoretic framework, the minimal set of axioms, and finally
the learning rule itself, in particular its memorylessness, speaks in favour of
that view. Furthermore, the rich phenomenology of the model can aid its
identification in such systems through the provision of many indicators,
stubbornness being a prime one, alongside the characteristic adaptive behaviour
with attenuated volatility.

Let us conclude by indicating some directions for further research, and pos- 
ing some open questions. 

In this first study, we have only considered static environments. One step 
to a more realistic model would be to include the effect of a statistical learning 
rule in it. This would generally lead to a slowly decreasing noise level, and 
in turn could let agents undergo behavioural changes. For example in class 
U, lowering the noise leads to cyclic behaviour ('evasive') through stochastic 
bifurcations. 

It would further be interesting, along the lines sketched above, to improve the
prejudiced learning rule itself by adding dynamical features. Within the
framework of our model, making the prejudice parameter $\alpha$ dynamic
naturally stands to reason, although too simplistic approaches would be
inappropriate, cf. Section 3.4. In view of the heuristics for the performance of
prejudiced learners developed in Section 4, it would seem promising, for example,
to lower $\alpha$ when the agent has lived through a period of low volatility in
the recent past. This modification would in particular pertain to S-agents,
whose behaviour would then eventually become adaptive after some time in most
cases, unlike S-agents with fixed $\alpha$ and $\rho > 3$, which normally enter
the uncertain (bifurcating) mode when the noise level decreases.

While we have concentrated here on the behaviour and performance of an
individual prejudiced learner, studying the effects their presence will have at
the community level in a society of learning agents is an important playground
for further research. Good objects for such a study would be small-world
networks [34, 35, 36, 37] as models for propagation of information. A prime
question here is to what extent a proportion of prejudiced learners can
stabilise beliefs in the given society. Studying these issues is work in
progress.

This also, and finally, leads us to formulate some questions of an evolutionary
kind. What quantitative relations between agents having various intrinsic
parameters, i.e., belonging to different classes, would emanate through
evolutionary pressure, over many generations of learners? The viability
conditions of Section 3.3 reflect our presumptions on the net effect of
evolutionary selection on the distribution of $(\alpha, r(0))$, and for A-agents
we have seen in Section 4 that they have, in principle, a good chance to compete.
Yet both the former presumptions and the latter claim still have to stand the
test of more realistic models, including concrete performance measures and
evolutionary selection schemes.